1. Identity statement | |
Reference Type | Conference Paper (Conference Proceedings) |
Site | sibgrapi.sid.inpe.br |
Holder Code | ibi 8JMKD3MGPEW34M/46T9EHH |
Identifier | 8JMKD3MGPAW/3MC59RH |
Repository | sid.inpe.br/sibgrapi/2016/08.31.17.22 |
Last Update | 2016:08.31.17.22.17 (UTC) administrator |
Metadata Repository | sid.inpe.br/sibgrapi/2016/08.31.17.22.17 |
Metadata Last Update | 2022:05.18.22.21.09 (UTC) administrator |
Citation Key | CavalinDornCruz:2016:ClLiEv |
Title | Classification of Life Events on Social Media |
Format | On-line |
Year | 2016 |
Access Date | 2024, Apr. 29 |
Number of Files | 1 |
Size | 192 KiB |
|
2. Context | |
Author | 1 Cavalin, Paulo 2 Dornelas, Fillipe 3 Cruz, Sergio |
Affiliation | 1 IBM Research 2 IBM Research, Universidade Federal Rural do Rio de Janeiro 3 Universidade Federal Rural do Rio de Janeiro |
Editor | Aliaga, Daniel G.; Davis, Larry S.; Farias, Ricardo C.; Fernandes, Leandro A. F.; Gibson, Stuart J.; Giraldi, Gilson A.; Gois, João Paulo; Maciel, Anderson; Menotti, David; Miranda, Paulo A. V.; Musse, Soraia; Namikawa, Laercio; Pamplona, Mauricio; Papa, João Paulo; Santos, Jefersson dos; Schwartz, William Robson; Thomaz, Carlos E. |
e-Mail Address | pcavalin@br.ibm.com |
Conference Name | Conference on Graphics, Patterns and Images, 29 (SIBGRAPI) |
Conference Location | São José dos Campos, SP, Brazil |
Date | 4-7 Oct. 2016 |
Publisher | Sociedade Brasileira de Computação |
Publisher City | Porto Alegre |
Book Title | Proceedings |
Tertiary Type | Industry Application Paper |
History (UTC) | 2016-08-31 17:22:17 :: pcavalin@br.ibm.com -> administrator :: 2022-05-18 22:21:09 :: administrator -> :: 2016 |
|
3. Content and structure | |
Is the master or a copy? | is the master |
Content Stage | completed |
Transferable | 1 |
Keywords | Social Media Life Events Classification Unbalanced datasets |
Abstract | In this paper we present an investigation of life event classification on social media networks. Detecting personal mentions of life events, such as travel, birthdays, and weddings, presents an interesting opportunity to anticipate the offer of products or services, as well as to enhance the demographics of a given target population. Nevertheless, life event classification can be seen as an unbalanced classification problem, where the set of posts that actually mention a life event is significantly smaller than the set of those that do not. For this reason, the main goal of this paper is to investigate different types of classifiers, using an experimental protocol based on datasets containing various types of life events in both Portuguese and English, and the benefits of over-sampling techniques for improving the accuracy of these classifiers on these sets. The results demonstrate that Logistic Regression may be a poor choice for the original datasets, but after over-sampling the training set, this classifier is able to outperform by a significant margin other classifiers such as Naive Bayes and Nearest Neighbours, which in most cases do not benefit as much from the over-sampled training set. |
Arrangement | urlib.net > SDLA > Fonds > SIBGRAPI 2016 > Classification of Life... |
doc Directory Content | access |
source Directory Content | there are no files |
agreement Directory Content | |
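
The over-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific over-sampling technique used in the paper, so plain random over-sampling (duplicating minority-class examples with replacement) is assumed here. The function name `random_oversample` and the toy posts are hypothetical and serve only as illustration.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples (sampling with replacement)
    until every class has as many examples as the largest class.

    Assumption: simple random over-sampling; the paper may use a
    different technique.
    """
    rng = random.Random(seed)

    # Group the examples by class label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: 1 = post mentions a life event, 0 = it does not.
posts = [
    "getting married next week",
    "nice weather today",
    "lunch was good",
    "traffic is terrible",
    "watching tv",
    "new phone arrived",
]
labels = [1, 0, 0, 0, 0, 0]

balanced_posts, balanced_labels = random_oversample(posts, labels)
print(Counter(balanced_labels))  # both classes now have the same count
```

As in the abstract, only the training set would be over-sampled; the test set should keep its original class distribution so that the reported accuracy reflects the real imbalance.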
|
4. Conditions of access and use | |
data URL | http://urlib.net/ibi/8JMKD3MGPAW/3MC59RH |
zipped data URL | http://urlib.net/zip/8JMKD3MGPAW/3MC59RH |
Language | en |
Target File | SibgrapiWIA_LifeEvents_2016_cameraready.pdf |
User Group | pcavalin@br.ibm.com |
Visibility | shown |
Update Permission | not transferred |
|
5. Allied materials | |
Mirror Repository | sid.inpe.br/banon/2001/03.30.15.38.24 |
Next Higher Units | 8JMKD3MGPAW/3M2D4LP |
Citing Item List | sid.inpe.br/sibgrapi/2016/07.02.23.50 9 |
Host Collection | sid.inpe.br/banon/2001/03.30.15.38 |
|
6. Notes | |
Empty Fields | archivingpolicy archivist area callnumber contenttype copyholder copyright creatorhistory descriptionlevel dissemination doi edition electronicmailaddress group isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder schedulinginformation secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url versiontype volume |
|